Improvement of Monte Carlo estimates with co variance-optimized finite-size scaling at fixed 

phenomenological coupling 
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In the finite-size scaling analysis of Monte Carlo data, instead of computing the observables at fixed Hamil- 
tonian parameters, one may choose to keep a renormalization-group invariant quantity, also called phenomeno- 
logical coupling, fixed at a given value. Within this scheme of finite-size scaling, we exploit the statistical 
covariance between the observables in a Monte Carlo simulation in order to reduce the statistical errors of the 
quantities involved in the computation of the critical exponents. This method is general and does not require 
additional computational time. This approach is demonstrated in the Ising model in two and three dimensions, 
where large gain factors in CPU time are obtained. 
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The Monte Carlo (MC) method is a powerful numerical 
technique which allows to study models with a large num- 
ber of degrees of freedom jl] 0]. One of the main features of 
the method is its flexibility: arbitrary statistical systems can 
be investigated in a regime where other methods (e.g., pertur- 
bative calculations) may not be available or efficient. For this 
reason it is one of the most used tools in critical phenomena 
Jit]. MC integration allows to compute quantities with, at least 
in principle, arbitrary precision. The accuracy of the results is 
proportional to the inverse square root of the computational 
time, hence even a small improvement in the statistical error 
bars can represent a significant gain in terms of computational 
resources. Several strategies have been developed in order to 
improve the accuracy: one of the most common methods is the 
use of improved estimators, which have the same expectation 
value of the desired quantities, but reduced error bars (see, 
e.g., Ref. yfl). More recently, it has been proposed to use a 
covariance analysis in order to compute the optimal weighted 
average of different estimates of a critical exponent [4] and to 
reduce the statistical error of an observable by adding a con- 
trol variate, whose mean value vanishes |5|]. 

In the context of numerical investigations of critical phe- 
nomena, finite-size scaling (FSS) techniques play an impor- 
tant role: they allow to extract the critical properties of a 
model from the data obtained in a region of parameters where 
the correlation length £ is of the order of the linear size L 
of the system. In the FSS analysis of MC data, instead of 
computing the observables at fixed Hamiltonian parameters, 
one can choose to fix a renormalization-group (RG) invariant 
quantity R(f3,L), also called phenomenological coupling, at 
a given value Rf . In this method, introduced in Ref. |6|] and 
discussed also in Refs. [71 la], one determines for every L an 
inverse temperature /?/(£) such that R((3 = f3f(L), L) = Rf. 
Then all the observables are computed at f3 = f3f(L) and 
the statistical errors can be estimated using standard Jack- 
knife techniques (see, e.g., Ref. jgi]). It is easy to show that 
/3f(L —> oo ) — > f3 c , with (3 C the inverse critical temperature. 
It has been observed that, thanks to cross correlations, this 
method gives in some cases reduced error bars (|7l H|. Error 



reduction has been reported also in the related FSS method of 
phenomenological renormalization, where two system sizes 
are enforced to share a common value of R — £/L 11 1 Qfl . 

In this work we study the improvement of the accuracy of 
MC estimates by considering FSS at fixed phenomenological 
coupling, to be chosen as a linear combination of a given set 
of phenomenological couplings. By exploiting the covariance 
between the observables, we show how to choose the combi- 
nation which minimizes the statistical errors of the quantities 
involved in the computation of the critical exponents. We il- 
lustrate this method by considering the Ising model in two and 
three dimensions simulated using, separately, Metropolis and 
Wolff single-cluster algorithms. 

Finite-size scaling at fixed phenomenological coupling A 
MC simulation consists of a Markov chain whose equilibrium 
distribution is the Boltzmann-Gibbs measure [1]. Any observ- 
able is obtained using a statistical estimator, i.e., a random 
variable whose expectation value is the thermal average of the 
observable. Since all the observables are calculated from the 
same configurations generated in the simulation, they are in 
general correlated with each other. We denote a random vari- 
able with a hat A, its expectation value by E[A] = A, and its 
fluctuation around the expectation value by 5 A = A — A. In 
what follows, the dependence of the observables on the size of 
the system is implicit. Let us consider a generic RG-invariant 
quantity R, which is sampled by an estimator R, and that we 
fix to the value Rf\ 



R(Pf) = Rf- 
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The actual computation of an estimate of R, and hence of (3f, 
involves an average over the entire MC run, while the statisti- 
cal error bars resulting from the fluctuations SR, 5(3f can be 
estimated using the Jackknife technique. This method consists 
of dividing the set of MC data into A^in consecutive blocks of 
equal size, which has to be much larger than the autocorrela- 
tion time, and then constructing Abi n Jackknife blocks by tak- 
ing the set of all data except one block. From the data of every 
single Jackknife block, one computes an estimate of R and of 
/3/ by solving Eq. (Q]), as well as an estimate of any other ob- 
servable O at the inverse temperature j3f, as determined from 
the same Jackknife block. This procedure results in Abi n (cor- 



related) estimators Of of the observable O at fixed R = Rf, 
from which the statistical error bars are computed using the 
standard Jackknife estimator [9]: 

?<„,,_ £(8jP- 8,) , (2, 

iv bin . ' 

where 0/ is an estimator of O at fixed R = Rf. Note that 

a 2 (Of) is itself a random variable, whose expectation value 
is the sought variance. Solving Eq. ((TJ requires an extrapola- 
tion of the MC data sampled at /3 run to j3 — j3f, which can 
be done by using reweighting techniques or by computing a 
Taylor expansion of the observables on j3. This in practice 
requires /3/ — j3 run , or equivalently R ~ Rf. Employing a 
first-order Taylor expansion of R((3f) around /3 run , (3f is ob- 
tained as 

Pf — Prun — - , (3) 

R' 

where R 1 is an estimator of the derivative of R with respect to 
f3. All the other observables are then computed at the inverse 
temperature f3f. From an observable O sampled by the esti- 
mator O we obtain an observable Of, which is sampled by the 
random variable Of. Using Eq. ©, Of is given by 

Of-d-O ^Z? 1 , (4) 
R' 

where O' is an estimator of the derivative of O with respect to 
f3 and as before we have used a first-order Taylor expansion 
of O around /3 mn . Expanding Eq. (0]l around the expectation 
value we obtain, to the first order in the fluctuations, 

^ ^ O' (SO'O' — \ 

S 0f cSO--SR-(R-R f )l — -^6R'J. (5) 

The second term in Eq. (0 represents the modification in the 
fluctuation of the observable O which is due to the fact that 
we have fixed R, while the third term is related to the vari- 
ation in the expectation value due to the extrapolation of the 
MC data to (3 — (3f. Since, as mentioned above, R ~ Rf, 
this last term is a correction, which can be neglected in first 
approximation. In the presence of TV RG-invariant quantities 
{Ri}, i = 1,. . . N, we can apply the above method by keeping 
fixed a generic linear combination of {Ri}, R = ^tXiRi. In 
this case generalizing Eq. (|5]l and neglecting its last term, we 
obtain 

50 f ({X t }) = Sd- °'^f\ X ^ R \ (6) 

Now we consider the following problem: find the coefficients 
Aj which minimize the variance of Of({Xi}). Notice that 
Eq. (|6|l is invariant under a global rescaling of the coefficients 
Aj — > juAj, [i 7^ 0: this simply reflects the equivalence of 
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FIG. 1. Decomposition of the vector SO into two com- 
ponents orthogonal SO± and parallel <50|| to the space 
V = Span < SRi, . . . , SRn >■ The affine subspace W is given by 
Eq. which represents the normalization on the parameters {Xi}. 
The minimum length for SO\\ — Sv is obtained when it is orthogonal 
toW. 

fixing R or /iR . It is convenient to fix the normalization by 
requiring that 

N 

so that we have to minimize the variance of 

N 

SOf 

.norm {{Xi}) = 50-J2^Ri (8) 

with the constraint of Eq. Q. The subscript norm in Eq. ([8]) 
reminds that SO /, n0 rm is chosen with the normalization given 
in Eq. (0. The optimal {A;} parameters can be found us- 
ing the method of Lagrangian multipliers. Here, we point out 
a geometrical interpretation of the problem which allows to 
find the solution immediately. The space of random variables 
with null expectation value is a real vector space. A positive- 
definite scalar product < , > between two vectors SA and SB 
can be defined by the covariance COV[iL4, SB] 

<SA,SB>= COV[5A,5B] = E[SASB] - E[SA]E[SB], 

(9) 

The square length of a vector corresponds to the variance of 
a random variable. Let V = Span < 6Ri, . . . , SRn > be 
the A^-dimensional subspace generated by {SRi}. According 
to Eq. ([8]), SO /.norm is obtained by subtracting to SO a vector 
Sv = Y^,i ^iSRi € V, whose components in the (nonorthogo- 
nal) base {SRi} are given by the coefficients {A^}. By decom- 
posing SO into two components orthogonal SO± and parallel 
SOh to V, one finds that the variance of SO /, n orm is given by 

VAR[(0 /jnorm ] = VAR[<0I] + VAR[<50jj - 5v] . (10) 

Furthermore, the constraint of Eq. ([7} represents an affine sub- 
space W of dimension AT — 1, to which Sv belongs. Min- 
imizing the variance of <50/ jnorm consists in finding a vec- 
tor Sv S W such that the length of 50u — Sv is minimum. 
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L 


X 


(dC/ 4 /d/3)/L d (dU 6 /d(3)/L d (di? 5 /d/3)/L d 


Metropolis dynamics 








8 


41.153(99) 


-0.06617(99) 


-0.1881(30) 


0.18353(90) 


imp 


40.878(29) 


-0.06490(17) 


-0.18935(44) 


0.18074(37) 


gain 


11.5 


35.4 


47.6 


6.0 


16 


140.60(60) 


-0.03193(93) 


-0.0909(28) 


0.09045(73) 


imp 


138.90(14) 


-0.03285(14) 


-0.09475(38) 


0.08836(34) 


gain 


18.4 


43.6 


54.2 


4.7 


32 


470.8(2.6) 


-0.01685(61) 


-0.0482(19) 


0.04490(42) 


imp 


469.71(49) 


-0.016369(83) 


-0.04714(23) 


0.04409(20) 


gain 


27.2 


54.9 


68.2 


4.5 


64 


1578.5(5.8) 


-0.00812(20) 


-0.02334(61) 


0.02151(16) 


imp 


1578.5(1.6) 


-0.008121(42) 


-0.02336(12) 


0.02149(10) 


gain 


13.3 


22.9 


27.5 


2.6 


128 


5284(28) 


-0.00433(15) 


-0.01251(47) 


0.01097(11) 


imp 


5316.8(6.0) 


-0.004078(24) 


-0.011736(67) 0.010956(62) 


gain 


22.1 


39.8 


48.9 


2.9 


Wolff dynamics 
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41.431(44) 


-0.06448(45) 


-0.1828(13) 


0.18634(60) 


imp 


40.829(18) 


-0.06362(18) 


-0.18572(46) 


0.18138(38) 


gain 


5.9 


6.3 


8.5 


2.5 


16 


139.70(23) 


-0.03294(37) 


-0.0942(11) 


0.08962(45) 


imp 


138.000(98) 


-0.03237(13) 


-0.09353(35) 


0.08848(31) 


gain 


5.4 


8.4 


9.6 


2.2 


32 


470.32(84) 


-0.01616(21) 


-0.04632(63) 


0.04359(27) 


imp 


469.01(35) 


-0.016312(77) 


-0.04691(21) 


0.04362(17) 


gain 


5.9 


7.6 


8.7 


2.4 


64 


1582.8(3.2) 


-0.00820(11) 


-0.02353(34) 


0.02179(15) 


imp 


1581.4(1.2) 


-0.008177(48) 


-0.02350(13) 


0.02175(10) 


gain 


6.5 


5.7 


6.7 


2.1 


128 


5327(12) 


-0.004102(70) 


-0.01176(21) 


0.010983(89) 


imp 


5321.2(4.3) 


-0.004075(29) 


-0.011696(81) 0.010883(63) 


gain 


7.7 


5.6 


6.5 


2.0 



TABLE I. Results for the two-dimensional (2D) Ising model with Metropolis and Wolff dynamics, separately. For each lattice size L, we 
compare standard analysis (first line), with the method explained in this work (imp). The approximate gain factor in CPU time is denoted with 
gain. 



This situation is illustrated in Fig.Q] The minimum length for 
80\\ — 5v is then obtained by choosing Sv such that <50|| — Sv 
is orthogonal to W. With this insight, the optimal coefficients 
{Xi} are calculated as 



A, 



R'T M -1 N _ 0> 



'M^RQ ^(M^N) (11) 



R' T M~ 1 R' 

where the matrix M and the vector N are defined as 

Mu =< 6Ri,SRj >, 



HI - 



(12) 
(13) 



and we have introduced the vector R'j = R^. The quantities 
involved in Eq. (TTTb can be estimated from MC data by using 
standard Jackknife techniques |9D. It is useful to observe that 



the covariances which enter in the definitions of M and N are 

j— i 

related to the transition matrix of the Markov chain [1|, and 
hence depend not only on the model, but also on the dynam- 
ics. Finally, it is worth noting that in order to correctly obtain 
the FSS limit, the coefficients {A^} must be the same for ev- 
ery size. They can be, however, chosen differently for each 
observable O considered. 

Results We have tested this method on the standard Ising 
model, which is defined on a d-dimensional lattice with linear 
size L by the Hamiltonian 



= ±1, 



(14) 



where the sum is over the nearest-neighbors sites and one can 
set J — 1. We consider here four RG-invariant quantities. 
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L 


X 


(dUi/d/3)/L d 


(dU 6 /dl3)/L d {dR i /dj3)/L d 


Metropolis dynamics 






8 


88.19(53) 


-0.1112(15) 


-0.4758(80) 0.07330(49) 


imp 


88.90(15) 


-0.11412(42) 


-0.5341(21) 0.08078(35) 


gain 


11.7 


12.3 


14.2 2.0 


16 


351.0(3.3) 


-0.04140(86) 


-0.1813(48) 0.02652(29) 


imp 


354.91(63) 


-0.04269(29) 


-0.19932(98) 0.02854(13) 


gain 


27.9 


9.1 


24.2 4.9 


32 


1382(14) 


-0.01574(36) 


-0.0697(20) 0.00992(11) 


imp 


1389.3(2.6) 


-0.015837(75) 


-0.07383(40) 0.010367(46) 


gain 


26.9 


23 


25.8 6.2 


64 


5373(46) 


-0.00612(12) 


-0.02731(70) 0.003686(34) 


imp 


5435.9(9.8) 


-0.005953(31) 


-0.02760(14) 0.003822(17) 


gain 


22.2 


16.1 


25.2 4.2 


128 


21537(200) 


-0.002220(48) 


-0.00981(27) 0.001412(15) 


imp 


21141(40) 


-0.002215(11) 


-0.010304(56)0.0014148(61) 


gain 


25.5 


19.7 


23. 6.1 


Wolff dynamics 
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87.82(28) 


-0.1098(11) 


-0.4682(55) 0.07244(40) 


imp 


88.48(11) 


-0.11608(37) 


-0.5418(20) 0.07995(25) 


gain 


6.3 


8.9 


7.7 2.7 


16 


350.7(1.2) 


-0.04273(45) 


-0.1880(23) 0.02704(16) 


imp 


352.83(47) 


-0.04306(17) 


-0.20094(90) 0.028609(91) 


gain 


7.0 


7.0 


6.7 3.1 


32 


1381.2(5.3) 


-0.01604(18) 


-0.07162(92) 0.009997(63) 


imp 


1388.4(1.9) 


-0.016051(76) 


-0.07521(39) 0.010391(36) 


gain 


7.8 


5.6 


5.5 3.1 


64 


5393(22) 


-0.006101(75) 


-0.02746(39) 0.003749(27) 


imp 


5438.6(7.3) 


-0.005943(33) 


-0.02769(16) 0.003824(15) 


gain 


9.5 


5.3 


6.3 3.2 


128 


21237(94) 


-0.002289(30) 


-0.01025(16) 0.001410(10) 


imp 


21188(29) 


-0.002291(14) 


-0.010308(64)0.0014145(59) 


gain 


10.1 


4.6 


5.8 3.1 



TABLE II. Same as in TableUfor the 3D Ising model. 



Besides the cumulants U4 and U$ defined by 

u * = Jm4' Mee ^>< (15) 

we also consider the ratio of the second-moment correlation 
length £ over L, = £/L, as well as the ratio of the par- 
tition function of a system with antiperiodic boundary con- 
ditions (b.c.) on one direction over the partition function of 
a system with full periodic b.c. Rz = Z a /Z p ; this quantity 
can be efficiently sampled using the boundary flip algorithm 
fiUl . In Eq. ( fT3T > the brackets ( ) denote the thermal av- 
erage. As for the observables, we consider here the suscep- 
tibility x = (M 2 )/L d , and the derivatives of RG-invariant 
quantities dU^/dp, dXJ§jdfi, dR^/dfi. At the critical point, 
as well as in the FSS at fixed phenomenological coupling, 
one has X °c i 2 "'' and dR/d(3 oc L 1 ^, R = U A ,U G ,R^ 
where 7/ and v are critical exponents. Thus these observables 



can be used to extract the value of the critical exponents. As 
mentioned after Eq. ( fTST l, the coefficients {A^} can be opti- 
mized separately for each observable. However, for a given 
observable they have to be the same for each lattice size. Of 
course, the coefficients determined from Eq. (Q~T} will be in 
general different for each lattice. Since in a typical simula- 
tion most of the computational time is spent on the largest 
lattice, a possible strategy for choosing the coefficients {Aj} 
consists in applying Eq. ( fTTT i to the largest available lattice, 
and then to use the same coefficients for the smaller lattices. 
We have found that the gain in the CPU time in most of the 
cases does not vary significantly with the lattice size, despite 
the fact that the chosen {Ai} are not the optimal ones for the 
smaller lattices. We have simulated the Ising model for lattice 
sizes L = 8-128 in two and three dimensions. As mentioned 
above, the statistical error bars and hence the gain in the CPU 
time resulting from the present method depend not only on 



5 



the model, but also on the dynamics used. To illustrate this 
aspect, we have considered data sampled using, separately, 
two different simulation algorithms: Metropolis and Wolff 
single cluster. Simulations in d = 2 have been carried out at 
P = 0.4406867935. The analysis at fixed phenomenological 
coupling has been done using the critical-point values U4 — 
1.167923(5), U 6 = 1.455649(7), Re = 0.9050488292(4) 
lfl2tl . and R z 0.3728848808 Ojj. The results are re- 
ported in Table U We compare the observables as determined 
with a standard analysis and with the present method (imp), 
with the coefficients determined, for each observable, from 
the data at L — 128. Notice that the mean values are not ex- 
pected to coincide in the two analyses, although they can be 
very close. The gain factor in the CPU time is given by the 
square of the ratio of the error bars. Inspecting Table |T|we 
observe large gains in the CPU time, especially for the ob- 
servables x, dUn/dfi, dUe/dp sampled with the Metropolis 
algorithm, where in most of the cases the gain lies between 
20 and 50; for the same observables, gains roughly between 
6 and 9 are obtained in the case of Wolff dynamics. The gain 
in the CPU time obtained for dR^ / d/3 is roughly ~ 2-3 for 
both algorithms. Simulations in d = 3 have been carried 
out at (3 = 0.2216544. FSS analysis at fixed phenomeno- 
logical coupling has been done using recent determinations of 
the critical-point values U 4 = 1.6036(1), U 6 = 3.1053(5), 
Rt = 0.6431(1), R z = 0.5425(1) Q. The results are re- 
ported in Table UTI The gains in the CPU time for x, dUn/df3, 
dUe/d/3 sampled with the Metropolis dynamics are in most 
of the cases between 20 and 30. For the other observables 



and for the Wolff dynamics the gains are similar to the bidi- 
mensional case. The present method represents also a signif- 
icant improvement over a simpler approach of fixing one of 
the four RG-invariant quantities: for instance by fixing R^ the 
CPU gain factor for \ is < 4 in d = 2, < 9 in d = 3 with 
Metropolis dynamics, and < 6 in d = 3 with Wolff dynamics. 

To summarize, we have presented a FSS method which al- 
lows for a substantial reduction of the statistical error bars 
without additional computational time. Application of the 
method to the Ising model shows that the CPU gain factor 
does not depend much on the lattice size, and it is more pro- 
nounced for a local Metropolis update algorithm. Other pos- 
sible interesting applications of this method should be found 
in "improved models" [3], where leading scaling corrections 
are suppressed, and in models with quenched disorder IU5II . 
where cluster algorithms are available only in special cases 
and significant reduction of error bars with FSS at fixed R^ 
has been reported H . 
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